Abstract

The Anti-Nuclear Antibody (ANA) clinical pathology test is 
commonly used to identify the existence of various diseases. 
A hallmark method for identifying the presence of ANAs is
the Indirect Immunofluorescence method on Human Epithelial
(HEp-2) cells, due to its high sensitivity and the large
range of antigens that can be detected. However, the method
suffers from numerous shortcomings, such as being subjec- 
tive as well as time and labour intensive. Computer Aided 
Diagnostic (CAD) systems have been developed to address 
these problems, which automatically classify a HEp-2 cell 
image into one of its known patterns (e.g., speckled,
homogeneous). Most of the existing CAD systems use handpicked
features to represent a HEp-2 cell image, which may only 
work in limited scenarios. In this paper, we propose a cell 
classification system comprised of a dual-region codebook- 
based descriptor, combined with the Nearest Convex Hull 
Classifier. We evaluate the performance of several variants
of the descriptor on two publicly available datasets: the
ICPR HEp-2 cell classification contest dataset and the new
SNPHEp-2 dataset. To our knowledge, this is the first time
codebook-based descriptors are applied and studied in this 
domain. Experiments show that the proposed system has 
consistent high performance and is more robust than two 
recent CAD systems. 

1. Introduction 

In recent years, there has been increasing interest in 
employing image analysis techniques for various routine 
clinical pathology tests [9, 10, 12]. Results produced by 
these techniques can be incorporated into subjective anal- 
ysis done by scientists, leading to test results being more 
reliable and consistent across laboratories [10]. 

The Anti-Nuclear Antibody (ANA) test is commonly 
used by clinicians to identify the existence of Connective 
Tissue Diseases such as Systemic Lupus Erythematosus, 
Sjögren's syndrome, and Rheumatoid Arthritis [16]. The
hallmark protocol for doing this is through Indirect
Immunofluorescence (IIF) on Human Epithelial type 2 (HEp-2)
cells [16, 29]. This is due to its high sensitivity and
the large range of antigen expression. Despite its
advantages, the IIF method is labour intensive and time
consuming [3, 20]. Each ANA specimen must be examined under
a fluorescence microscope by at least two scientists. This
reliance on manual examination also renders the test result
more subjective, leading to low reproducibility and large
variability between and within personnel and laboratories
[10, 24]. To address these issues, it is possible
to use Computer Aided Diagnostic (CAD) systems which
automatically determine the HEp-2 pattern in the given cell
images of a specimen [6, 7, 10, 11, 19, 24, 25]. Examples
of specimen images are shown in Figure 1.

Figure 1. Examples of strong positive ANA specimens.

Properties of existing CAD systems in the literature are 
shown in Table 1. Most of these systems have a common
trend: they use carefully handpicked features which may 
only work in a particular laboratory environment and/or mi- 
croscope configuration. To address this, several approaches 
employ a large number of features and apply an auto- 
mated feature selection process [10]. Another approach 
uses Multi Expert Systems to allow the use of a specifically 
tailored feature set and classifier for each HEp-2 cell pat- 
tern class [24]. Nevertheless, the generalisation ability of 
these systems is still not guaranteed since these were only 
evaluated in one particular dataset with a specific setup. 



Table 1. Existing CAD systems for HEp-2 cell classification.

Approach                Descriptors                                                Classifier
Perner et al. [19]      Textural                                                   Decision Tree
Hiemann et al. [10]     Structural; textural (1400 features)                       Logistic Model Tree
Elbischger et al. [7]   Image statistics; cell shape; textural (9 features)        Nearest Neighbour
Hsieh et al. [11]       Image statistics; textural (8 features)                    Learning Vector Quantisation (LVQ)
Soda et al. [24]        Specific set of features (e.g. textural) for each class    Multi Expert System
Cordelli et al. [6]     Image statistics; textural; morphological (15 features)    AdaBoost
Strandmark et al. [25]  Morphological; image statistics; textural (322 features)   Random Forest

One of the most popular approaches for automatic im- 
age classification, here called the codebook approach, is 
to express an image in terms of a set of visual words, 
selected from a dictionary that has been trained before- 
hand [13, 23, 28]. In order to model an image, the code- 
book approach divides the image into small image patches, 
followed by patch-level feature extraction. An encoding 
process is then employed to compute a histogram of vi- 
sual words based on these patches. Codebook-based de- 
scriptors often have higher discrimination power compared 
to the other image descriptors [13, 28, 30]. Thus, we ar- 
gue that better classification performance can be achieved 
by employing such descriptors for CAD systems. 

Contributions. In this work we propose the use of 
a dual-region codebook-based descriptor, specifically de- 
signed to exploit the nature of cell images, coupled with 
an adapted form of the Nearest Convex Hull classifier [17]. 
To our knowledge, this is the first time the codebook ap- 
proach is applied and studied for the HEp-2 cell classifica- 
tion task. We evaluate two methods for low-level feature 
extraction from image patches, SIFT [15] and DCT [23], 
in conjunction with three methods for generating the his- 
tograms of visual words: vector quantisation [28], soft as- 
signment [23] and sparse coding [30]. We furthermore pro- 
pose a new HEp-2 cell image classification dataset, denoted 
as SNPHEp-2, which allows the evaluation of the robust- 
ness of CAD systems to various hardware configurations. 
The number of images is much larger than the existing 
ICPRContest dataset [8]. 

We continue this paper as follows. We first delineate the 
HEp-2 cell classification task in Section 2. In Section 3 
we present the dual-region codebook-based descriptor. In 
Section 4 we overview the Nearest Convex Hull classifier. 
Section 5 is devoted to experiments and discussions. Main
findings and future research avenues are given in Section 6. 

2. HEp-2 Cell Classification System 

Each positive HEp-2 cell image^1 is represented as a
three-tuple $(I, M, \delta)$ which consists of: (i) the
Fluorescein Isothiocyanate (FITC) image channel $I$; (ii) a
binary cell mask image $M$ which can be manually defined, or
extracted from the 4',6-diamidino-2-phenylindole (DAPI) image
channel [10]; and (iii) the fluorescence intensity
$\delta \in \{strong, weak\}$ which specifies whether the cell
is a strong positive or weak positive. Strong positive images
normally have more defined details, while weak positive images
are duller.

Let $Y$ be a probe image, $Y = (I, M, \delta)$, and $\ell$ be
its class label. Given a gallery set
$\mathcal{G} = \{(I, M, \delta)_1, (I, M, \delta)_2, \ldots, (I, M, \delta)_n\}$,
the task of a classifier
$\varphi : Y \times \mathcal{G} \mapsto \hat{\ell}$ is to
produce $\hat{\ell}$, where ideally $\hat{\ell} = \ell$.

We consider six HEp-2 cell patterns [29] listed below; 
example images are shown in Fig. 2. 

(1) homogeneous: a uniform diffuse fluorescence covering the
entire nucleoplasm sometimes accentuated in the nuclear
periphery

(2) coarse speckled: densely distributed, variously sized speckles, 
generally associated with larger speckles, throughout nucleo- 
plasm of interphase cells; nucleoli are negative 

(3) fine speckled: fine speckled staining in a uniform distribution, 
sometimes very dense so that an almost homogeneous pattern 
is attained; nucleoli may be positive or negative

(4) nucleolar: brightly clustered larger granules corresponding to 
decoration of the fibrillar centers of the nucleoli as well as the 
coiled bodies 

(5) centromere: rather uniform discrete speckles located through- 
out the entire nucleus 

(6) cytoplasmic: a very fine dense granular to homogeneous stain- 
ing or cloudy pattern covering part or the whole cytoplasm 
^1 It is assumed that the cell images have been extracted from specimen
images via an approach such as background subtraction [21].



Figure 2. Sample images from the ICPRContest dataset [8] and the
proposed SNPHEp-2 dataset. Columns, left to right: homogeneous,
coarse speckled, fine speckled, nucleolar, centromere, cytoplasmic.



3. HEp-2 Cell Image Descriptor 

The overall idea of the proposed HEp-2 cell image descriptor
is shown in Fig. 3. Each cell is divided into small
overlapping patches. The patches are then used to construct
two histograms of visual words: inner and outer histograms,
depending on whether the patches come from the inside of the
cell, or its edges, respectively. We first describe the
patch-level features in Section 3.1, followed by presenting
the dual-region structure in Section 3.2. In Section 3.3 we
present several histogram encoding methods.

3.1. Patch-level Feature Extraction 

Given a HEp-2 cell image $(I, M, \delta)$, both the FITC
image $I$ and mask image $M$ are divided into small
overlapping patches $P_I = \{p_{I,1}, p_{I,2}, \ldots, p_{I,n}\}$
and $P_M = \{p_{M,1}, p_{M,2}, \ldots, p_{M,n}\}$. The division
is accomplished in the same manner for both images, resulting
in each patch in the FITC image having a corresponding patch
in the mask image. Let $f$ be a patch-level feature extraction
function $f : p_I \mapsto x$, where $x \in \mathbb{R}^d$. $P_I$
can now be represented as $X = \{x_1, x_2, \ldots, x_n\}$.
Figure 3. Conceptual diagram for the proposed HEp-2 cell
image descriptor. Both the FITC image and its corresponding mask
image are divided into small overlapping patches. Patch-level
features are extracted from FITC patches. Each FITC patch is then
classified into either outer or inner region by using information
extracted from its corresponding mask patch. Finally, both inner and
outer histograms are obtained by an encoder employing a learned
dictionary of visual words.

We selected two patch-level feature extraction techniques,
based on the Scale Invariant Feature Transform
(SIFT) and the Discrete Cosine Transform (DCT). The low-level
SIFT descriptor is invariant to uniform scaling and
orientation, and partially invariant to affine distortion and
illumination changes [15]. These attributes are advantageous
in this classification task as cell images are unaligned and
have high within class variabilities. DCT based features
proved to be effective for face recognition in video
surveillance [23, 30]. By using only the low frequency DCT
coefficients (essentially a low-pass filter), each patch
representation is relatively robust to small alterations [23].
We follow the extraction procedures for SIFT and DCT as
per [14] and [23], respectively.
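The patch division and low-frequency DCT step described above can be sketched as follows. This is a minimal illustration rather than the procedure of [23]: the patch size, the step, and the choice to keep the top-left block of coefficients (instead of, say, a zig-zag ordering) are assumptions made here, and the function names are hypothetical.

```python
import math

def extract_patches(image, size, step):
    """Return (row, col, patch) for square patches taken every `step` pixels."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - size + 1, step):
        for c in range(0, cols - size + 1, step):
            patch = [row[c:c + size] for row in image[r:r + size]]
            patches.append((r, c, patch))
    return patches

def dct2(patch):
    """Orthonormal 2D DCT-II of a square patch (direct O(n^4) evaluation)."""
    n = len(patch)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (patch[y][x]
                              * math.cos(math.pi * (2 * y + 1) * u / (2 * n))
                              * math.cos(math.pi * (2 * x + 1) * v / (2 * n)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs

def low_frequency_features(patch, keep):
    """Keep the top-left keep x keep block of DCT coefficients (assumed layout)."""
    coeffs = dct2(patch)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]
```

Applying the same `extract_patches` call to the FITC image and to the mask keeps the two patch lists aligned, so the i-th FITC patch and the i-th mask patch cover the same pixels.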

3.2. Dual-Region Structure 

We aim to model cell characteristics by building sepa- 
rate histograms for the inner region (which is often a uni- 
form texture) and the outer region (which contains infor- 
mation related to cell edges and shape). To this end, each 
FITC patch is first classified as either belonging to the inner 
or outer region by inspecting its corresponding mask patch. 
Fig. 4 shows how the regions are imposed on a cell image. 

Let $X = X_o \cup X_i$, with $X_o$ representing the set of outer
patches, and $X_i$ the set of inner patches. The classification
of patch $p_I$ into a region is done via:

$$ p_I \in \begin{cases} X_o & \text{if } \tau_1 \le fg(p_M) < \tau_2 \\ X_i & \text{if } \tau_2 \le fg(p_M) \end{cases} \qquad (1) $$

where $p_M$ is the corresponding mask patch; $fg(p_M)$
computes the percentage of foreground pixels from mask patch
$p_M$; $\tau_1$ is the minimum foreground pixel percentage of a
patch belonging to the outer region; and $\tau_2$ is the
maximum foreground pixel percentage of a patch belonging to
the outer region, as well as the minimum pixel percentage
of a patch belonging to the inner region.
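A minimal sketch of the region assignment in Eqn. (1) is given below. It assumes the foreground measure is expressed as a fraction in [0, 1] and that both lower bounds are inclusive; the threshold values used in any call are illustrative and are not taken from the paper. Patches below the lower threshold are treated as background and discarded.

```python
def foreground_fraction(mask_patch):
    """Fraction of nonzero (foreground) pixels in a binary mask patch."""
    total = sum(len(row) for row in mask_patch)
    foreground = sum(1 for row in mask_patch for pixel in row if pixel)
    return foreground / total

def assign_region(mask_patch, tau1, tau2):
    """Return 'inner', 'outer', or None following the thresholds of Eqn. (1)."""
    fg = foreground_fraction(mask_patch)
    if fg >= tau2:
        return "inner"
    if fg >= tau1:
        return "outer"
    return None  # mostly background: the patch is not used
```

For example, with the hypothetical thresholds `tau1 = 0.2` and `tau2 = 0.9`, a patch whose mask is half foreground is assigned to the outer region.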

3.3. Generation of Histograms 

Let $X_r$ be the set of patch-level features for either the
inner or outer region (i.e., $X_r = X_i$ or $X_r = X_o$). For each
patch-level feature $x_j \in X_r$, a local histogram $h_j$ is
obtained by an encoding method. The overall histogram
representation for region $r$ is then obtained via averaging [23, 30]:

$$ H^{[r]} = \frac{1}{|X_r|} \sum_{j=1}^{|X_r|} h_j \qquad (2) $$

where $|X_r|$ is the number of elements in set $X_r$. In this
work we consider three popular histogram encoding methods:
(1) vector quantisation; (2) soft assignment; (3) sparse
coding. The methods are elucidated below.

Figure 4. Conceptual diagram for the proposed dual-region
structure. The green line is the cell boundary. The outer region is
denoted by the striped patterns, and inner region is the area denoted
by the red arrow.



3.3.1 Vector Quantisation (VQ) 

Given set $\mathcal{D}$, a dictionary of visual words, the i-th
dimension of local histogram $h_j$ for patch $x_j$ is computed via:

$$ h_{j,i} = \begin{cases} 1 & \text{if } i = \arg\min_{k \in \{1, \ldots, |\mathcal{D}|\}} \mathrm{dist}(x_j, d_k) \\ 0 & \text{otherwise} \end{cases} \qquad (3) $$

where $\mathrm{dist}(x_j, d_k)$ is a distance function between $x_j$
and $d_k$, with $d_k$ the k-th entry in the dictionary $\mathcal{D}$.
The dictionary is obtained via the k-means algorithm [2] on
training patches.
The VQ approach is considered as a hard assignment 
approach since each image patch is only assigned to one 
of the visual words. This hard assignment is sensitive to 
noise [28]. 
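The hard assignment and the subsequent averaging into a region histogram can be sketched as below. The sketch assumes the dictionary has already been trained (for instance with k-means) and uses the squared Euclidean distance as the distance function, which is a choice made here for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def vq_histogram(feature, dictionary):
    """One-hot local histogram: 1 at the nearest visual word, 0 elsewhere."""
    distances = [squared_distance(feature, word) for word in dictionary]
    nearest = distances.index(min(distances))
    return [1.0 if i == nearest else 0.0 for i in range(len(dictionary))]

def average_histogram(local_histograms):
    """Region histogram: element-wise mean of the local histograms."""
    count = len(local_histograms)
    size = len(local_histograms[0])
    return [sum(h[i] for h in local_histograms) / count for i in range(size)]
```

Because each patch votes for exactly one word, a small perturbation of a feature lying between two words can move its whole vote, which is the noise sensitivity mentioned above.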

3.3.2 Soft Assignment (SA) 

In comparison to the VQ approach above, a more robust 
approach is to apply a probabilistic method [23]. Here the 
visual dictionary $\mathcal{D}$ is a convex mixture of Gaussians. The
i-th dimension of the local histogram for $x_j$ is calculated as:

$$ h_{j,i} = \frac{w_i \, p_i(x_j)}{\sum_{k=1}^{|\mathcal{D}|} w_k \, p_k(x_j)} \qquad (4) $$

where $p_i(x)$ is the likelihood of $x$ according to the i-th
component of the visual dictionary:

$$ p_i(x) = \frac{\exp\left[ -\frac{1}{2} (x - \mu_i)^T C_i^{-1} (x - \mu_i) \right]}{(2\pi)^{d/2} \, |C_i|^{1/2}} \qquad (5) $$

with $w_i$, $\mu_i$ and $C_i$ representing the weight, mean vector
and covariance matrix of Gaussian $i$, respectively. The
scalar $d$ represents the dimensionality of $x$. The dictionary
$\mathcal{D}$ is obtained using the Expectation Maximisation
algorithm [2] on training patches.
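A minimal sketch of the soft assignment in Eqns. (4) and (5) follows. It assumes diagonal covariance matrices (each given as a list of variances) and an already trained mixture; both are simplifications made here. The posterior is computed in the log domain so that very small likelihoods do not underflow.

```python
import math

def log_gaussian(x, mean, variances):
    """Log-density of a Gaussian with diagonal covariance (assumed form)."""
    log_det = sum(math.log(v) for v in variances)
    mahalanobis = sum((xi - mi) ** 2 / v
                      for xi, mi, v in zip(x, mean, variances))
    return -0.5 * (len(x) * math.log(2.0 * math.pi) + log_det + mahalanobis)

def soft_histogram(x, weights, means, variances):
    """Local histogram of Eqn. (4): posterior probability of each Gaussian."""
    log_terms = [math.log(w) + log_gaussian(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
    peak = max(log_terms)  # subtract the maximum before exponentiating
    unnormalised = [math.exp(t - peak) for t in log_terms]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]
```

Unlike the one-hot vector of the VQ approach, every entry of this histogram is nonzero, and the entries sum to one.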

3.3.3 Sparse Coding (SC) 

It has been observed that each local histogram produced via 
Eqn. (4) is sparse in nature (ie., most elements are close to 
zero) [30]. As such, it is possible to adapt dedicated sparse 
coding algorithms in order to represent each patch as a com- 
bination of dictionary atoms [5, 31]. 

A vector of weights $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_n]^T$
is computed for each $x_j$ by solving a minimisation problem that
selects a sparse set of dictionary atoms. As the theoretical
optimality of the $\ell_1$-norm minimisation solution is
guaranteed [27], in this work we used:

$$ \min_{\alpha} \; \| D\alpha - x_j \|_2^2 + \lambda \sum_{k} \| \alpha_k \|_1 \qquad (6) $$

where $\| \cdot \|_p$ denotes the $\ell_p$-norm and
$D \in \mathbb{R}^{d \times n}$ is a matrix of dictionary atoms.
The dictionary $D$ is trained by using the K-SVD algorithm [1].



As $\alpha$ can have negative values due to the objective
function in Eqn. (6), we construct each local histogram using the
absolute value of each element in $\alpha$ [30]:

$$ h_j = [\, |\alpha_1|, |\alpha_2|, \ldots, |\alpha_n| \,] \qquad (7) $$



Compared to both Eqns. (3) and (4), obtaining the his- 
togram using sparse coding is considerably more computa- 
tionally intensive, due to the need to solve a minimisation 
problem for each patch. 
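To make the per-patch cost concrete, the sketch below solves an l1-regularised least squares problem with the iterative shrinkage-thresholding algorithm (ISTA). The solver is swapped in here for illustration; the text does not state which solver was used. The objective is assumed to be the squared l2 reconstruction error plus lambda times the l1 norm of the weights, the dictionary is passed as a list of atoms (the columns of D), and the function names are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink a scalar towards zero by `threshold`."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def sparse_code(x, atoms, lam, iterations=200):
    """ISTA for min_a ||D a - x||_2^2 + lam * ||a||_1, atoms = columns of D."""
    n, dim = len(atoms), len(x)
    # 2 * ||D||_F^2 upper-bounds the Lipschitz constant of the gradient.
    frobenius_sq = sum(v * v for atom in atoms for v in atom)
    step = 1.0 / (2.0 * frobenius_sq)
    alpha = [0.0] * n
    for _ in range(iterations):
        # residual = D alpha - x
        residual = [sum(alpha[k] * atoms[k][i] for k in range(n)) - x[i]
                    for i in range(dim)]
        # gradient of the quadratic term: 2 D^T residual
        gradient = [2.0 * sum(atoms[k][i] * residual[i] for i in range(dim))
                    for k in range(n)]
        alpha = [soft_threshold(alpha[k] - step * gradient[k], step * lam)
                 for k in range(n)]
    return alpha

def sparse_histogram(x, atoms, lam):
    """Local histogram built from the absolute values of the weights."""
    return [abs(a) for a in sparse_code(x, atoms, lam)]
```

Each call runs a full iterative optimisation, whereas the VQ and soft assignment encoders need only one pass over the dictionary, which illustrates why sparse coding is the most expensive of the three.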



4. Classifiers

Let set $\Theta_X = \{ H_X^{[i]}, H_X^{[o]} \}$ represent the average inner
and outer histograms for cell image $X$. Below we describe
two classifiers that we have adapted to use both the inner 
and outer regions: (1) nearest neighbour (NN), and (2) near- 
est convex hull (NCH). 

4.1. Nearest Neighbour (NN) 

The NN classifier assigns the class of probe image A to 
be the class of the nearest training image B. For the pur- 
poses of this classifier, and inspired by [26], we define the 
distance between images A and B as: 



$$ d(\Theta_A, \Theta_B) = \gamma \, \| H_A^{[i]} - H_B^{[i]} \|_p + (1 - \gamma) \, \| H_A^{[o]} - H_B^{[o]} \|_p \qquad (8) $$

where $\gamma \in [0, 1]$ is a mixing parameter found during
training.
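The nearest neighbour rule with a mixed inner/outer distance can be sketched as follows. The sketch assumes that the mixing parameter weights the inner-region term and that its complement weights the outer-region term; each image is represented as a pair (inner histogram, outer histogram), and the labels in any usage are illustrative.

```python
def lp_distance(a, b, p=1):
    """l_p distance between two histograms of equal length."""
    return sum(abs(ai - bi) ** p for ai, bi in zip(a, b)) ** (1.0 / p)

def dual_region_distance(image_a, image_b, gamma, p=1):
    """Mix the inner and outer histogram distances with weight gamma."""
    inner = lp_distance(image_a[0], image_b[0], p)
    outer = lp_distance(image_a[1], image_b[1], p)
    return gamma * inner + (1.0 - gamma) * outer

def nearest_neighbour(probe, gallery, gamma, p=1):
    """gallery is a list of ((inner, outer), label); return the closest label."""
    best = min(gallery,
               key=lambda item: dual_region_distance(probe, item[0], gamma, p))
    return best[1]
```

Setting `gamma` to 1 or 0 reduces the classifier to using only the inner or only the outer histogram, which is a convenient way to check how much each region contributes.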

4.2. Nearest Convex Hull (NCH) 

In NCH, each training class is approximated with a sim- 
ple convex model, or more specifically, the convex hull of 
the descriptors of the training images [17]. This reduces the 
sensitivity to within class variation, as "missing" samples 
can be approximated using the convex model [4]. 

Let $\Lambda_C = \{ \Omega_C^{[i]}, \Omega_C^{[o]} \}$ denote the model
of class $C$, comprised of inner and outer components. In order to
take into account both the inner and outer histograms, we define
the distance between image $A$ and class $C$ as:

$$ d(\Theta_A, \Lambda_C) = \gamma \, d_{NCH}(H_A^{[i]}, \Omega_C^{[i]}) + (1 - \gamma) \, d_{NCH}(H_A^{[o]}, \Omega_C^{[o]}) \qquad (9) $$

where $\gamma \in [0, 1]$ is a mixing parameter found during
training, and $d_{NCH}(H, \Omega_C)$ is the distance between histogram
$H$ and convex model $\Omega_C$, defined as:

$$ d_{NCH}(H, \Omega_C) = \min_{\omega \in \Omega_C} \| H - \omega \|_1 \qquad (10) $$

where $\Omega_C$ is a set of points generated by a linear
combination of the training samples. Given a set of training
histograms for class $C$, $\{H_1, H_2, \ldots, H_m\}$, each member of
$\Omega_C$ is defined as [17]:

$$ \omega = \sum_{i=1}^{m} \beta_i H_i, \quad \text{subject to } \sum_{i=1}^{m} \beta_i = 1 \qquad (11) $$



The above model implicitly treats any combination of 
histogram descriptors as a valid gallery example. 
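The idea of measuring the distance from a probe histogram to a class hull can be sketched as below. Several simplifications are assumed here and are not taken from the paper: the Euclidean norm is used instead of the norm in the text, the combination weights are constrained to be nonnegative as well as summing to one, and the minimisation is done with the Frank-Wolfe method with exact line search. Function names and the class model layout are hypothetical.

```python
def hull_distance(h, samples, iterations=500):
    """Approximate Euclidean distance from h to the convex hull of samples."""
    dim = len(h)
    omega = [float(v) for v in samples[0]]  # start at a vertex of the hull
    for _ in range(iterations):
        diff = [omega[i] - h[i] for i in range(dim)]
        # Vertex that most decreases the squared distance (linear step).
        vertex = min(samples,
                     key=lambda s: sum(diff[i] * s[i] for i in range(dim)))
        direction = [vertex[i] - omega[i] for i in range(dim)]
        norm_sq = sum(d * d for d in direction)
        if norm_sq == 0.0:
            break
        # Exact line search, clipped to [0, 1] so omega stays inside the hull.
        step = -sum(diff[i] * direction[i] for i in range(dim)) / norm_sq
        step = max(0.0, min(1.0, step))
        if step == 0.0:
            break
        omega = [omega[i] + step * direction[i] for i in range(dim)]
    return sum((omega[i] - h[i]) ** 2 for i in range(dim)) ** 0.5

def nch_classify(probe, class_models, gamma):
    """class_models maps label -> (inner_samples, outer_samples)."""
    def score(label):
        inner_samples, outer_samples = class_models[label]
        return (gamma * hull_distance(probe[0], inner_samples)
                + (1.0 - gamma) * hull_distance(probe[1], outer_samples))
    return min(class_models, key=score)
```

A probe that lies between two training histograms of a class has distance zero to that class, even though it may be far from every individual training sample; this is how the convex model fills in "missing" samples.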



5. Experiments 

In this section we first overview the two datasets used 
in the experiments. We then evaluate the six variants of 
the codebook-based descriptor, where each of two low-level 
feature extraction techniques (SIFT and DCT) is coupled 
with three possible methods for generating the histograms 
of visual words (VQ, SA, and SC). Finally we compare the 
best codebook-based variant against two recently proposed 
systems. The various systems were implemented with the 
aid of the Armadillo C++ library [22] . 

5.1. ICPR HEp-2 Contest Dataset 

The ICPR HEp-2 Cell Classification Contest Dataset 
(ICPRContest) [8] contains 1,457 cells extracted from 28
specimen images. It contains six patterns: centromere, 
coarse speckled, cytoplasmic, fine speckled, homogeneous, 
and nucleolar. Each specimen image was acquired by 
means of a fluorescence microscope (40-fold magnification)
coupled with a 50W mercury vapour lamp and with a CCD
camera. The cell image masks were hand labelled. See 
Fig. 2 for examples. 

As the official test set is not yet publicly available, we 
use only the official training set to create ten-fold validation 
sets. The available images are divided into training and test- 
ing sets with 14 specimens each. As such, in each set, each 
pattern class only has 1-2 specimen images, where a spec- 
imen image contains a set of cells having the same pattern 
(see Fig. 1 for a visual representation). In total there are 
721 and 736 cell images extracted for training and testing, 
respectively. The validation sets were created by randomly 
selecting the images from the 721 cell images. Each fold 
contains 652 and 69 cell images for training and testing re- 
spectively. 

Note that due to the abovementioned limitation of avail- 
able images, each class can only have 1-2 specimen images. 
As such there is an assumed bias, as there is a high chance 
that cells extracted from the same specimen image exist in 
both the training and testing sets of each fold. 

5.2. SNP HEp-2 Cell Dataset 

The SNP HEp-2 Cell Dataset (SNPHEp-2) was obtained 
between January and February 2012 at Sullivan Nicolaides 
Pathology laboratory, Australia. The dataset^2 has five
patterns: centromere, coarse speckled, fine speckled,
homogeneous and nucleolar. The 18-well slide of the HEP-2000 IIF
assay from Immuno Concepts N.A. Ltd. with screening dilution
1:80 was used to prepare 40 specimens. Each specimen image was
captured using a monochrome high dynamic range cooled
microscopy camera, which was fitted on a microscope with a
plan-Apochromat 20x/0.8 objective lens and an LED illumination
source. The DAPI image channel was used to automatically
extract the cell image masks.

^2 The SNPHEp-2 dataset is available for download at
http://itee.uq.edu.au/~lovell/snphep2/

There are 1,884 cell images extracted from 40 specimen 
images. The specimen images are divided into training and 
testing sets with 20 images each (4 images for each pattern). 
In total there are 905 and 979 cell images extracted for
training and testing, respectively. Five-fold validation sets
were created by randomly selecting the training and test
images. The training and testing sets in each fold together
contain around 900 cell images (approx. 450 images each). Examples are
shown in Fig. 2. 

By using the SNPHEp-2 dataset in addition to the 
ICPRContest dataset, we obtain the following benefits: 
(1) the specimens of both datasets were not captured by the 
same microscope configuration (e.g., the microscope's
objective lens magnification is 40x for ICPRContest and 20x
for SNPHEp-2), allowing us to test the robustness of CAD
systems to variations in image capture conditions; (2) there 
is no bias compared to the ICPRContest experiment setup, 
allowing for a more thorough evaluation. 

5.3. Codebook-Based Descriptor Variants 

In this section we evaluate the discriminative power of 
the codebook-based descriptor, with and without the dual- 
region structure. When the dual-region structure is not em- 
ployed, each cell image is represented by one histogram 
constructed from both the inner and outer patches. 

As there are three histogram encoding methods (ie., VQ, 
SA and SC) and two patch-level features (ie., SIFT and 
DCT), there are six variants of the codebook-based descrip- 
tor. For clarity, each variant is styled as: [patch-level fea- 
tures] -[histogram encoding method]. For example, the vari- 
ant using DCT as its patch-level features and VQ as its en- 
coding method is called DCT-VQ. 

The NN classifier was employed in this comparison, in 
order to reduce the total number of combinations. Based on 
preliminary experiments, we selected the $\ell_1$-norm distance in
Eqn. (8) to measure the distance between two images. All 
other hyperparameters of each approach were found in the 
training set of each cross-validation set. 

Table 2 presents the average Correct Classification Rate 
(CCR) for each descriptor variant on the ICPRContest and 
SNPHEp-2 datasets, using both single- and dual-region
configurations. We can observe that all variants have higher
CCR on ICPRContest than on SNPHEp-2, which is consistent
with the bias in the ICPRContest dataset setup.

The DCT-SA variant is more discriminative and more robust to
various hardware configurations, as it consistently
outperforms the other variants on both datasets. This high
performance can be partly attributed to the effect of soft
assignment, which can be more expressive than the other
encoding methods [28].

Generally, DCT has better performance than SIFT on
most codebook-based variants on both datasets. The only
exception is on SNPHEp-2, where SIFT-SC outperforms
DCT-SC. This suggests that for this application, DCT is
more suitable than SIFT for representing low-level patch
features.

Table 2. Performance comparison of codebook-based descriptor
variants on the ICPRContest and SNPHEp-2 datasets, using the
NN classifier. The scores are shown as average correct
classification rate (in percentage) along with their standard
deviations. SR = single region; DR = dual region.

Descriptor    ICPRContest                 SNPHEp-2
Variant       SR            DR            SR            DR
DCT-SA        93.8 ± 2.2    94.9 ± 2.1    74.7 ± 3.6    76.7 ± 2.5
DCT-VQ        89.5 ± 3.4    90.6 ± 3.2    72.3 ± 3.2    73.6 ± 2.1
DCT-SC        81.7 ± 3.4    84.8 ± 3.5    59.9 ± 2.7    63.6 ± 1.8
SIFT-SA       80.1 ± 3.8    82.6 ± 3.5    56.6 ± 3.1    59.2 ± 2.5
SIFT-VQ       86.4 ± 4.0    86.8 ± 3.7    64.7 ± 2.3    64.9 ± 2.5
SIFT-SC       79.9 ± 3.3    86.1 ± 2.6    66.4 ± 2.9    67.9 ± 2.5

The results also show that imposing a spatial structure 
(ie., using the dual-region setup in contrast to the single- 
region setup) increases the performance of all of the vari- 
ants, while generally reducing the standard deviation. 

5.4. Classifier Variants 

In this section we compare the performance of the NN 
and NCH classifiers on both datasets. We use the most dis- 
criminative descriptor found in Section 5.3, ie., DCT-SA. 
In addition, we have also evaluated the performance of two 
baseline descriptors: (i) raw image, where a raw cell image 
is vectorised, and (ii) rotation invariant Local Binary
Patterns (LBP) [18], using a configuration of 8 neighbours and
1 pixel radius. For the baseline descriptors, the cell images 
were used without any further processing (eg., no spatial 
structure was imposed). 

The results, presented in Table 3, show that NCH gen- 
erally outperforms NN regardless of the descriptor being 
used. An exception is LBP on SNPHEp-2, where NN performs
slightly better. In most cases the NCH classifier
also provides the most improvement on the more difficult
SNPHEp-2 dataset, with its performance on the ICPRContest
dataset close to the performance of NN. The results also
show that the proposed DCT-SA approach (in both single- 
and dual-region configurations) considerably outperforms 
LBP, especially on the SNPHEp-2 dataset. 

5.5. Comparative Evaluation of Systems 

In this section we compare the best performing 
codebook-based system found in Section 5.4, ie., dual- 
region DCT-SA coupled with the NCH classifier, against 
two recently proposed systems in Cordelli et al. [6] and 
Strandmark et al. [25]. 

We implemented the best reported descriptor in [6],
which is comprised of features such as image energy, mean
and entropy, calculated from intensity and LBP channels.
The LBP channel is obtained by computing the local pattern
code for each pixel in the intensity channel. We selected
Logistic Boosting (LogitBoost) as the classifier instead
of AdaBoost as the former obtained better performance.
We denote this system as Cordelli.

Table 3. Performance comparison of the NN and NCH classifiers
on the ICPRContest and SNPHEp-2 datasets. SR = single region;
DR = dual region.

Descriptor      ICPRContest                 SNPHEp-2
                NN            NCH           NN            NCH
DCT-SA + DR     94.9 ± 2.1    95.5 ± 2.2    76.7 ± 2.5    80.6 ± 2.1
DCT-SA + SR     93.8 ± 2.2    94.3 ± 2.3    74.7 ± 3.6    78.5 ± 3.2
LBP             85.8 ± 2.6    86.4 ± 3.1    49.9 ± 3.8    47.6 ± 2.2
Raw Image       39.8 ± 3.2    57.7 ± 4.5    39.1 ± 3.3    43.1 ± 4.2

Figure 5. Performance comparison of various systems on the
ICPRContest and SNPHEp-2 datasets. SR = single region; DR =
dual region; NCH = Nearest Convex Hull Classifier. Systems shown:
Raw Image + NCH; LBP + NCH; Cordelli + LogitBoost;
Strandmark + RandomForest; DCT-SA + SR + NCH; DCT-SA + DR + NCH.

We denote the system in [25] as Strandmark, and used 
the code provided by the authors. The system employs vari- 
ous image statistics features (eg., mean, standard deviation) 
and morphological features (eg., number of objects, area). 
The random forest classifier is used. 

The results are presented in Fig. 5. Both the Cordelli 
and Strandmark systems have reasonable performance on 
the ICPRContest dataset. Strandmark has slightly better 
performance than the proposed DCT-SA system (96.1% vs 
95.5%). However, both the Strandmark and Cordelli sys- 
tems perform poorly on the more challenging SNPHEp-2 
dataset, while the proposed system has considerably bet- 
ter performance. This indicates that the descriptors used 
by Cordelli and Strandmark systems are sensitive to hard- 
ware configuration variations. The poor performance of 
the Cordelli system on SNPHEp-2 can be partly explained 
from the observation that LBP also performs poorly on 
SNPHEp-2. As mentioned before, Cordelli uses the LBP 
channel to compute some of its features. 



6. Main Findings 
The Indirect Immunofluorescence method on Human
Epithelial (HEp-2) cells is a hallmark method for identifying
the presence of Anti-Nuclear Antibodies in clinical
pathology tests. Despite its high sensitivity and the large 
range of antigens that can be detected, it has numerous 
shortcomings, such as being subjective as well as time and 
labour intensive. Computer Aided Diagnostic (CAD) sys- 
tems have been recently developed to address these prob- 
lems, which automatically classify a HEp-2 cell image into 
one of the known patterns (eg., speckled, homogeneous). 
Most of the existing CAD systems use handpicked features 
to represent a HEp-2 cell image, which may only work in 
limited scenarios. 

In this paper we have proposed a cell classification sys- 
tem comprised of a dual-region codebook-based descriptor 
combined with the Nearest Convex Hull Classifier. The 
system splits a cell image into small patches, which are 
then grouped into sets representing the inner and edge regions
of the cell. Each region is then described as a histogram
of visual words. To our knowledge, this is the first
time codebook-based descriptors are successfully applied 
and thoroughly studied in the domain of cell classification. 

We evaluated numerous variants of the descriptor on two 
publicly available datasets: ICPR HEp-2 cell classifica- 
tion contest dataset and the new SNPHEp-2 dataset. We 
found that DCT patch-level features in conjunction with 
soft-assignment/probabilistic encoding of histograms lead
to the highest discrimination performance. We also found 
that imposing the dual-region spatial structure increases dis- 
crimination performance of all codebook-based descriptor 
variants. Furthermore, the experiments show that the pro- 
posed system has consistent high performance and is more 
robust than two recent CAD systems presented in [6, 25]. 

We note that the proposed dual-region spatial structure 
used in this work is intuitive and lacks a theoretical explana- 
tion. Given the encouraging results, a more complete model 
of spatial structure could be developed to further increase 
performance. 